Student Team: NO
Approximately how many hours were spent working on this submission in total?
30 hours for exploration and 24 hours for reporting
May we post your submission in the Visual Analytics Benchmark Repository after
VAST Challenge 2014 is complete? YES
Video:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC2.1 – Describe common daily routines for
GAStech employees. What does a day in the life of a typical GAStech employee
look like? Please limit your response to
no more than five images and 300 words.
Figure 1.1. The map shows repeatedly visited
individual and public places extracted from the trajectories. The places have
been semantically interpreted and classified based on the visit times and,
where appropriate, card transaction data.
Figure 1.2. The time histograms represent the weekly
temporal patterns of visiting different kinds of places, showing the typical
locations/activities of the people in different hours of the week.
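A weekly time histogram of this kind can be sketched as follows. This is only an illustration, not the tool used for the figure: the function names and the sample visits are hypothetical, and each visit is counted once, in the hour in which it starts.

```python
from collections import Counter
from datetime import datetime

def hour_of_week(ts):
    """Map a timestamp to an hour of the week: Monday 00:00 -> 0, Sunday 23:00 -> 167."""
    return ts.weekday() * 24 + ts.hour

def weekly_histogram(visits):
    """Count visits per (place type, hour of week).

    visits: iterable of (start_time, place_type) pairs (assumed input layout).
    """
    return Counter((place_type, hour_of_week(start)) for start, place_type in visits)

# Hypothetical sample visits, for illustration only.
visits = [
    (datetime(2014, 1, 6, 7, 30), "coffee"),       # Monday
    (datetime(2014, 1, 13, 7, 45), "coffee"),      # Monday, one week later
    (datetime(2014, 1, 18, 13, 5), "restaurant"),  # Saturday
]
histogram = weekly_histogram(visits)
```

Visits from different weeks fall into the same bin, which is what makes the weekly pattern visible.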
Figure 1.3. The place types, called semantic places, have been arranged in a spatial layout called semantic space. The movements between
the geographic locations have been semantically abstracted and summarized into
flows between semantic places. The widths of the flow symbols are proportional
to the total counts of the moves between the respective types of places.
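The summarization into flows can be illustrated with a minimal sketch, assuming that each daily trajectory has already been reduced to a sequence of semantic place types. The function name and the sample sequences are hypothetical.

```python
from collections import Counter

def count_flows(daily_sequences):
    """Count moves between consecutive semantic places over all sequences."""
    flows = Counter()
    for seq in daily_sequences:
        for origin, dest in zip(seq, seq[1:]):
            if origin != dest:  # skip repeated records of the same place
                flows[(origin, dest)] += 1
    return flows

# Hypothetical daily sequences of place types, for illustration only.
sequences = [
    ["home", "coffee", "work", "lunch", "work", "home"],
    ["home", "work", "lunch", "work", "dinner", "home"],
]
flows = count_flows(sequences)
```

The resulting counts are the quantities that a flow map would encode as arrow widths.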
Figure 1.4. The flows summarizing the trajectories dynamically
react to filtering of the trajectories. We have selected the trajectories of
the trucks (A), of the cars on the week days (B), and of the cars on the
weekend (C), to see the respective typical movements. Note the states of the
dynamic query and focuser for the maximal arrow width.
Figure 1.5. We have computed the flow intensities by hours of the
week and clustered the hours by similarity of the flow intensities. The time
arranger shows the weekly temporal pattern of the color-coded clusters. The
small multiple maps represent the averaged flows corresponding to each time
cluster, showing what movements are typical for different time intervals.
Hence, a typical weekday routine
is home – breakfast/coffee (between 7 and 8 o’clock) – work – lunch (12-13
o’clock) – work – home (17-18 o’clock) – {dinner, shopping, visits to
colleagues} (18-20 o’clock) – home. On the weekend, people frequently go to
restaurants, less frequently to shops, and occasionally to sports venues or
museums. On weekdays, trucks move between work (GasTech), various
companies, and the airport; sometimes they stop at restaurants for lunch.
MC2.2 – Identify up to twelve unusual events
or patterns that you see in the data. If you identify more than twelve patterns
during your analysis, focus your answer on the patterns you consider to be most
important for further investigation to help find the missing staff members. For
each pattern or event you identify, describe
a. What is the pattern or event you observe?
b. Who is involved?
c. What locations are involved?
d. When does the pattern or event take place?
e. Why is this pattern or event significant?
f. What is your level of confidence about this pattern or event? Why?
Please limit your answer to no more than twelve images and 1500 words.
1) Repeated visits to 5 unknown places by 4 security
employees
Bodrogi, Ferro, Mies (Minke), and
Osvaldo, all having employment type “Security”, repeatedly visited 5 places
labeled in Figure 1 as BFMO-1 to BFMO-5. These places do not correspond to any
local businesses or homes of any employees. The places were usually visited before
lunch.
Figure 2.1. The table view shows when, by
whom, and how long each place was visited. There were 5 cases when two or three
persons were in the same place simultaneously: Bodrogi + Mies at BFMO-1 on
08/01/2014, Ferro + Osvaldo at BFMO-3 on 10/01/2014; Osvaldo + Bodrogi + Ferro
at BFMO-4 on 15/01/2014; Mies + Osvaldo at BFMO-2 on 16/01/2014; Bodrogi +
Ferro at BFMO-1 on 17/01/2014.
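Simultaneous presence of several persons in one place can be found by checking visit intervals for overlap. The sketch below assumes a hypothetical record layout (person, place, start, end). The first two names and the date follow the table; the third record and all clock times are invented for illustration.

```python
from datetime import datetime
from itertools import combinations

def co_presences(visits):
    """Return (person_a, person_b, place) for visits overlapping in time at one place.

    visits: list of (person, place, start, end) tuples (assumed layout).
    """
    found = []
    for a, b in combinations(visits, 2):
        same_place = a[1] == b[1]
        different_people = a[0] != b[0]
        overlapping = a[2] < b[3] and b[2] < a[3]
        if same_place and different_people and overlapping:
            found.append((a[0], b[0], a[1]))
    return found

# Illustrative records; the clock times are invented.
day = datetime(2014, 1, 8)
visits = [
    ("Bodrogi", "BFMO-1", day.replace(hour=11, minute=0), day.replace(hour=11, minute=40)),
    ("Mies", "BFMO-1", day.replace(hour=11, minute=20), day.replace(hour=12, minute=0)),
    ("Ferro", "BFMO-3", day.replace(hour=11, minute=10), day.replace(hour=11, minute=30)),
]
pairs = co_presences(visits)
```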
Figure 2.2. By filtering through the semantic space map, we have
selected only those daily trajectories that include visits to BFMO places.
There are 30 such trajectories, which are also shown in summarized form as a
set of flows. The map shows that the visitors typically went to the BFMO places
from work and then went for lunch, which means that the BFMO places
are not lunch places.
2) Night visits of security employees to colleagues
Bodrogi, Mies, Osvaldo, and Isia
Vann, all having employment type “Security”, visited their colleagues
Campo-Corrente, Strum, Vasco-Pais, and Barranco at night. All the
visited persons have the employment type “Executive”. Each time there were two
visitors, one arriving shortly after 23:00 and the other shortly after
03:30. There were 4 such cases in total. In one case, both visitors stayed
until about 07:30. In the three other cases, the first visitor left shortly before
the second visitor arrived.
Figure 2.3. The table view shows all visits to colleagues’ homes
selected by filtering through the semantic space map. Azada, evidently, had a
party at his home on January 10, when he was visited by many colleagues.
Suspicious night visits are highlighted in bold. By comparing Figure 2.3 with
Figure 2.2, we see that on the day after each night visit, one of the
visitors (the one who came later) appeared at one of the BFMO places.
3) Night visits to work (GasTech)
Figure 2.4. By spatio-temporal filtering, we have selected the
visits to the work place (GasTech) at unusual times, i.e., not starting between
06:00 and 18:00. Alcazar visited GasTech four times at night. Also
suspicious is the visit by Truck 104 on January 16 at 20:00, which lasted 15 minutes. On
January 13, Truck 101 came unusually late.
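The temporal part of such a filter can be sketched as a simple predicate on the visit start time. This is an illustration only: the function name is hypothetical, and treating 06:00 as inside and 18:00 as outside the usual window is an assumption, since the boundary handling is not specified.

```python
from datetime import datetime, time

def starts_at_unusual_time(start, day_begin=time(6, 0), day_end=time(18, 0)):
    """True if a visit does not start within [day_begin, day_end)."""
    return not (day_begin <= start.time() < day_end)

# The 20:00 start on January 16 is taken from the caption; the 09:00 start is invented.
late_visit = datetime(2014, 1, 16, 20, 0)
normal_visit = datetime(2014, 1, 16, 9, 0)
flags = [starts_at_unusual_time(late_visit), starts_at_unusual_time(normal_visit)]
```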
4) Midday visits of two employees to the hotel
Figure 2.5. The table view
shows the visits to the hotel selected through spatial filtering. Tempestad and
Borrasca met four times in the hotel at about lunch time on January 08, 10, 14,
and 17. The remaining visits belong to Sanjorge Jr., who, evidently, stayed in
the hotel from January 17 till 19.
5) Overnight stay of Nubarron at Kronos Capitol
Figure 2.6. We have selected the visits that lasted for 6 or more
hours, such that the visited place was not home, work, or hotel. Besides the
already known visits to colleagues and a BFMO place, we see a visit of Nubarron
to Kronos Capitol, which lasted almost 24 hours. However, it may mean that
Nubarron left his car at the Capitol and moved without it during this time.
Figure 2.7. The table below shows who else visited Kronos Capitol
on January 18. One of the visitors, Bodrogi, is the security employee known for
his visits to BFMO places and night visits to colleagues. It may be that the
long stay of Nubarron’s car at the Capitol is related to his meeting with
Bodrogi.
6) Two homes of Hennie Osvaldo
For Hennie Osvaldo, two places
have been classified as home places based on the temporal patterns of the place
visits. The place visited slightly more frequently was labeled “home” and the
other place “home 2”. “Home” coincides with the home places of Dedos and B.
(Birgitta) Frente, and “home 2” coincides with the home places of Bodrogi,
Ferro, and I. (Isia) Vann. Note that Bodrogi, Ferro, I.Vann, and Osvaldo are
security employees known for their strange activities reported in sections 1
and 2.
Figure 2.8. The time histograms show when Osvaldo visited her two
homes during the two week period.
Figure 2.9. The semantic space map shows aggregated movements of
Osvaldo. It shows that Osvaldo always returns from work to “home” rather
than to “home 2”. From “home”, she goes for lunch/dinner, and then often goes
to “home 2”, which may be the home of Osvaldo’s partner. Moves to
breakfast/coffee before work occurred more frequently from “home 2” than from
“home”.
7) Strange movements of trucks
Toward the end of the data time span,
some trucks made strange movements back and forth along the same routes without
stopping: (1) truck 101 driven by Albina Hafon in the afternoon of Monday,
January 13; (2) truck 107 driven by Irene Nant in the afternoon of Wednesday,
January 15; (3) trucks 104, 105, and 106, driven by Henk Mies, Valeria Morlun,
and Dylan Scozzese, respectively, in the afternoon of Thursday, January 16; (4)
truck 107 driven by Cecilia Morluniau around midday on Friday, January 17.
Figure 2.10. Screenshots of a space-time cube display showing the
strange truck trajectories. The colors red, green, blue, yellow, and magenta
correspond to trucks 101, 104, 105, 106, and 107, respectively. Stops are
manifested by vertical segments of trajectories. To make the stops easier
to notice, we have extracted the points where the trucks stopped for at least
one minute. These points are represented by violet balls. The display clearly
shows that the trucks did not stop when repeatedly moving back and forth.
8) Relationships between people
From the trajectories, we have
extracted all meetings of the people and excluded the meetings that occurred at
work and the meetings of people living together at their homes. From the
remaining meetings, we have computed distances between individuals based on the
relative frequencies of their meetings.
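The exact distance measure is not spelled out here, so the following is only one possible definition, given as a sketch: the meeting count of a pair is divided by the largest pair count, and the distance is one minus this ratio. The function name and the sample pairs are hypothetical; pairs that never met are absent from the result and can be treated as having the maximal distance 1.

```python
from collections import Counter

def meeting_distances(meetings):
    """Turn meeting counts into pairwise distances in [0, 1).

    meetings: iterable of (person_a, person_b) pairs, one per meeting.
    Assumed measure: distance = 1 - count / max_count.
    """
    counts = Counter(tuple(sorted(pair)) for pair in meetings)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {pair: 1.0 - n / max_count for pair, n in counts.items()}

# Hypothetical meetings, for illustration only.
distances = meeting_distances([("A", "B"), ("B", "A"), ("A", "C")])
```

A matrix of such distances is the kind of input that a 2D projection method requires.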
Figure 2.11. The map display shows the space of inter-personal
relationships. The 2D projection has been obtained based on the pair-wise
distances between the individuals. The dots represent the individuals and are
colored according to their employment types. The curved connecting lines
represent the strengths of the relationships between the individuals (i.e., the
relative meeting frequencies) by proportional widths and opacities. We see a tight
group of security employees (the group also includes two non-security persons).
Two security persons, Cocinaro and Osvaldo, bridge this group with another
tight group made by engineering and information technology employees. Another
group of engineers is relatively separated from the latter group and from the
security group but has strong links to executive staff.
Figure 2.12. In the same map display as before, the connecting
lines represent the absolute numbers of meetings. We have applied filtering to
see the people met by Sanjorge (Jr.) during his stay in Abila from January 17
till 19. Here the maximal number of meetings is 2.
MC2.3 – Like most datasets, the data you were provided is
imperfect, with possible issues such as missing data, conflicting data, data of
varying resolutions, outliers, or other kinds of confusing data. Considering MC2 data is primarily
spatiotemporal, describe how you
identified and addressed the uncertainties and conflicts inherent in this data
to reach your conclusions in questions MC2.1 and MC2.2. Please limit your response to no more than
five images and 300 words.
1) Track of E. Orilla
Figure 3.1. The track of Elsa Orilla, which is represented by the
blue line, is extremely noisy (zigzagged).
A: By comparing the stop
positions extracted from the trajectory (orange dots) with the position of
GasTech and the positions of the businesses where E.Orilla paid with her credit
card (light blue circles), we see that the positions are systematically shifted
to the northwest. The positions of the businesses had been determined earlier
based on the stop positions of the other employees.
B: We have modified all positions
in E.Orilla’s trajectory by subtracting the average deviations of the
longitudes and latitudes of the stop positions from the locations where they
were supposed to be. Although the newly extracted stop positions (orange dots)
are scattered due to the noise, the clusters cover or overlap with the real
positions of the places visited by E.Orilla. To deal with the noise when
extracting the repeatedly visited places of E.Orilla, we have set a
sufficiently large distance threshold (150 m) for the stop point clustering.
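The shift correction can be illustrated with a minimal sketch, assuming that each observed stop has already been paired with the reference location where it was supposed to be. The function names and all coordinates are hypothetical.

```python
def mean_offset(observed, reference):
    """Average (longitude, latitude) deviation of observed stops from reference places."""
    n = len(observed)
    d_lon = sum(o[0] - r[0] for o, r in zip(observed, reference)) / n
    d_lat = sum(o[1] - r[1] for o, r in zip(observed, reference)) / n
    return d_lon, d_lat

def correct_track(track, offset):
    """Subtract the average deviation from every (longitude, latitude) position."""
    d_lon, d_lat = offset
    return [(lon - d_lon, lat - d_lat) for lon, lat in track]

# Invented coordinates: the observed stops lie to the northwest of the reference places.
observed = [(24.860, 36.070), (24.870, 36.060)]
reference = [(24.862, 36.068), (24.872, 36.058)]
offset = mean_offset(observed, reference)
corrected = correct_track(observed, offset)
```

A constant shift removes only the systematic part of the error; the random scatter remains, which is why a generous clustering threshold is still needed.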
2) Finding meetings of people from noisy data
In finding meetings of people, we
accounted for possible positioning errors by setting a sufficiently large
threshold (150 m) for the spatial distance between the positions of different
individuals.
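A distance-tolerant meeting test can be sketched as follows. Only the 150 m threshold comes from the text; the record layout, the function names, and the use of the haversine great-circle distance are assumptions made for illustration.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) positions in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_meeting(stop_a, stop_b, max_dist_m=150.0):
    """stop: (lon, lat, start, end); a meeting needs spatial closeness and time overlap."""
    close = haversine_m(stop_a[0], stop_a[1], stop_b[0], stop_b[1]) <= max_dist_m
    overlapping = stop_a[2] < stop_b[3] and stop_b[2] < stop_a[3]
    return close and overlapping

# Invented stops; times are minutes from an arbitrary origin.
near = is_meeting((24.86, 36.050, 0, 30), (24.86, 36.051, 10, 40))  # about 111 m apart
far = is_meeting((24.86, 36.050, 0, 30), (24.86, 36.052, 10, 40))   # about 222 m apart
```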
Figure 3.2. The map and the space-time cube show the meetings of
two or more people extracted from the set of trajectories. The meetings are
represented by spatial buffers in red; in the STC, the buffers are extended
vertically proportionally to the meeting durations. Cyan circles on the map and
balls in the STC represent the stop positions. In finding the meetings, we
excluded the stop positions at GasTech, where all employees regularly met.
3) Missing data
Large spatial gaps between
consecutive position records indicate that many intermediate positions are missing.
Such cases manifest themselves on a map as long straight trajectory segments
that do not follow any streets. When positions are missing, we cannot determine
the durations of staying at the last recorded locations and cannot know where
the vehicles were between the recorded positions.
Figure 3.3. The map shows the trajectory segments where the
distances between the consecutive points exceed 250 m. The dots mark the
starting positions of the segments, i.e., the last positions before the gaps.
Different colors correspond to different individuals/vehicles: blue to Calzas,
pink to E.Orilla, and red to truck 107. For Orilla, there was only one gap
between the first record and the remaining part of the trajectory. For Calzas,
there were 28 gaps that repeatedly occurred throughout the time period of the
data. The position recording was often interrupted at Calzas’s home and some
public places, but there were also breaks in other locations. For truck 107,
there were 7 gaps that began at either GasTech or Carlyle Chemical Inc.; each
time, the next recorded position was at the other place of the two.
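Gap detection of this kind reduces to comparing the distance between consecutive points with a threshold. In the sketch below, only the 250 m threshold comes from the caption; the function names, the equirectangular distance approximation, and the sample track are assumptions.

```python
import math

def dist_m(p, q):
    """Approximate distance in metres between (lon, lat) points (equirectangular)."""
    mean_lat = math.radians((p[1] + q[1]) / 2)
    dx = math.radians(q[0] - p[0]) * math.cos(mean_lat)
    dy = math.radians(q[1] - p[1])
    return 6371000.0 * math.hypot(dx, dy)

def find_gaps(track, max_step_m=250.0):
    """Return indices i where the step from track[i] to track[i + 1] exceeds the threshold."""
    return [i for i in range(len(track) - 1)
            if dist_m(track[i], track[i + 1]) > max_step_m]

# Invented track: the steps are roughly 111 m, 334 m, and 56 m long.
track = [(24.86, 36.050), (24.86, 36.051), (24.86, 36.054), (24.86, 36.0545)]
gaps = find_gaps(track)
```

Each returned index identifies the last position before a gap, which corresponds to the dots marking the segment starts.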
4) Wrong times of day in card transaction data
The times of day of the card
transaction records from the coffee shops “Bean There Done That”, “Brewed
Awakenings”, and “Jack's Magical Beans” are always the same: 12:00:00.
Therefore, it was not possible to determine the positions of these coffee shops
from the positions of the people who paid with their credit cards at the moments
of payment. To determine the positions of the coffee shops, we considered the
spatial clusters of stops in the places that had not yet been identified and
compared the lists of the people who stopped there with the lists of the coffee shop
visitors who paid by credit card.
Figure 3.4. The spatial clusters of stops that have been referred
to three coffee shops based on comparing the lists of stopping people and the
lists of credit card payments in each day.
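The list comparison can be illustrated by scoring each pair of stop cluster and coffee shop by the overlap of their (day, person) sets. This is a sketch under assumptions: the Jaccard index as the similarity measure, the function names, the cluster identifiers, and the persons P1 to P3 are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_clusters(cluster_visits, shop_payments):
    """Assign to each stop cluster the shop with the most similar (day, person) set."""
    return {
        cluster: max(shop_payments, key=lambda shop: jaccard(visits, shop_payments[shop]))
        for cluster, visits in cluster_visits.items()
    }

# Invented visit and payment lists, for illustration only.
cluster_visits = {
    "cluster_1": {("2014-01-06", "P1"), ("2014-01-06", "P2"), ("2014-01-07", "P1")},
    "cluster_2": {("2014-01-06", "P3"), ("2014-01-07", "P3")},
}
shop_payments = {
    "Bean There Done That": {("2014-01-06", "P3"), ("2014-01-07", "P3")},
    "Brewed Awakenings": {("2014-01-06", "P1"), ("2014-01-07", "P1")},
}
assignment = match_clusters(cluster_visits, shop_payments)
```

Because only the day is compared, the unusable 12:00:00 timestamps do not affect the result.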
5) Matching trucks to drivers and determining
the locations of the businesses visited by the trucks
We have extracted stops of trucks
from the truck trajectories and compared the times of the stops with the card
transaction times of the people who did not use GasTech cars privately at
different businesses. This allowed us to determine the probable drivers of the
trucks (these are the people who paid by card during the truck stops) and the
spatial locations of the businesses that were visited only by trucks (from the
spatial positions of the stops). The businesses include Abila Airport, Abila
Scrapyard, Carlyle Chemical Inc., Kronos Pipe and Irrigation, Maximum Iron and
Steel, Nationwide Refinery, and Stewart and Sons Fabrication.
Figure 3.5. The map shows the spatial clusters of truck stops
that occurred in places not visited by personal cars. The place names have been
determined by matching the times of the stops with the times of card
transactions of people who did not use personal cars.
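The time matching can be sketched as an interval test: a transaction is attributed to a truck stop if its time falls within the stop interval. The record layouts and the function name are assumptions. The pairing of truck 101 with Albina Hafon follows the text above, while the clock times, the second person, and the choice of businesses in the sample are invented.

```python
from datetime import datetime

def match_transactions_to_stops(stops, transactions):
    """Return (truck_id, person, business) for transactions made during a truck stop.

    stops: list of (truck_id, start, end); transactions: list of (person, business, time).
    """
    matches = []
    for truck_id, start, end in stops:
        for person, business, t in transactions:
            if start <= t <= end:
                matches.append((truck_id, person, business))
    return matches

# Illustrative records with invented times.
stops = [("101", datetime(2014, 1, 13, 10, 0), datetime(2014, 1, 13, 10, 25))]
transactions = [
    ("Albina Hafon", "Abila Airport", datetime(2014, 1, 13, 10, 10)),
    ("P9", "Abila Scrapyard", datetime(2014, 1, 13, 15, 0)),
]
matches = match_transactions_to_stops(stops, transactions)
```

A match gives both a candidate driver for the truck and a name for the stop location; repeated matches over several days make the attribution more reliable.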